Allelematch: an R package for identifying unique multilocus genotypes where genotyping error and missing data may be present.
Authors
Abstract
We present allelematch, an R package, to automate the identification of unique multilocus genotypes in data sets where the number of individuals is unknown, and where genotyping error and missing data may be present. Such conditions commonly occur in noninvasive sampling protocols. Output from the software enables a comparison of unique genotypes and their matches, and facilitates the review of differences between profiles. The software has a variety of applications in molecular ecology, and may be valuable where a large number of samples must be processed, unique genotypes identified, and repeated observations made over space and time. We used simulations to assess the performance of allelematch and found that it can reliably and accurately determine the correct number of unique genotypes (± 3%) across a broad range of data set properties. We found that the software performs with highest accuracy when genotyping error is below 4%. The R package is available from the Comprehensive R Archive Network (http://cran.r-project.org/). Supplementary documentation and tutorials are provided.
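The matching problem described in the abstract can be illustrated with a minimal Python sketch. This is not allelematch's algorithm or API (the package is written in R and its internals are not described here); it is a simplified greedy grouping in which every name (`count_mismatches`, `group_genotypes`, `max_mismatches`) and all sample data are hypothetical. It assumes that a missing allele call never counts as a mismatch, and that two profiles belong to the same individual when they differ at no more than a chosen number of allele calls.

```python
from typing import Optional, Sequence

# Hypothetical representation: one allele call per column, None = missing data.
Genotype = Sequence[Optional[int]]


def count_mismatches(a: Genotype, b: Genotype) -> int:
    """Count allele columns where both calls are present and differ."""
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def group_genotypes(
    samples: dict[str, Genotype], max_mismatches: int
) -> dict[str, list[str]]:
    """Greedily assign each sample to the first representative it matches.

    A sample that matches no existing representative becomes a new
    unique genotype. This is an illustrative assumption, not the
    method used by allelematch.
    """
    groups: dict[str, list[str]] = {}
    for name, genotype in samples.items():
        for rep in groups:
            if count_mismatches(samples[rep], genotype) <= max_mismatches:
                groups[rep].append(name)
                break
        else:
            groups[name] = [name]
    return groups


# Made-up profiles: three loci, two allele columns each.
samples = {
    "s1": (120, 124, 200, 204, 310, 310),
    "s2": (120, 124, 200, 204, 310, 312),    # one call differs from s1
    "s3": (120, 124, None, None, 310, 310),  # one locus missing
    "s4": (118, 122, 198, 206, 300, 314),    # a different individual
}

groups = group_genotypes(samples, max_mismatches=1)
for representative, members in groups.items():
    print(representative, members)
```

Tightening the hypothetical `max_mismatches` threshold to 0 would split `s2` into its own unique genotype, which shows why the tolerated mismatch level matters when genotyping error is present.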
Similar resources
The use of family relationships and linkage disequilibrium to impute phase and missing genotypes in up to whole-genome sequence density genotypic data.
A novel method, called linkage disequilibrium multilocus iterative peeling (LDMIP), for the imputation of phase and missing genotypes is developed. LDMIP performs an iterative peeling step for every locus, which accounts for the family data, and uses a forward-backward algorithm to accumulate information across loci. Marker similarity between haplotype pairs is used to impute possible missing g...
Imputing missing genotypes: effects of methods and patterns of missing data
Costs of high-throughput genotyping have decreased to the point where it appears economically feasible to use molecular genetic marker information in applied breeding programs. Some practical questions remain to be addressed about how best to deal with missing data in the resulting genotype datasets, to minimize the impact of the missing data on the accuracy of breeding value prediction. Data c...
Inferring haplotypes from genotypes on a pedigree with mutations, genotyping errors and missing alleles
Inferring the haplotypes of the members of a pedigree from their genotypes has been extensively studied. However, most studies do not consider genotyping errors and de novo mutations. In this paper, we study how to infer haplotypes from genotype data that may contain genotyping errors, de novo mutations, and missing alleles. We assume that there are no recombinants in the genotype data, which i...
Haplotype sharing transmission/disequilibrium tests that allow for genotyping errors.
The present study introduces new Haplotype Sharing Transmission/Disequilibrium Tests (HS-TDTs) that allow for random genotyping errors. We evaluate the type I error rate and power of the new proposed tests under a variety of scenarios and perform a power comparison among the proposed tests, the HS-TDT and the single-marker TDT. The results indicate that the HS-TDT shows a significant increase i...
Genotype List String: a grammar for describing HLA and KIR genotyping results in a text string
Knowledge of an individual's human leukocyte antigen (HLA) genotype is essential for modern medical genetics, and is crucial for hematopoietic stem cell and solid-organ transplantation. However, the high levels of polymorphism known for the HLA genes make it difficult to generate an HLA genotype that unambiguously identifies the alleles that are present at a given HLA locus in an individual. Fo...
Journal:
- Molecular ecology resources
Volume 12, Issue 4
Pages -
Publication date: 2012